Appendices


Structure/Function  Studies  of  Androgen  Receptor 

DNA  Binding  Region 


Introduction 

The  Androgen  Receptor: 

Androgens (testosterone, dihydrotestosterone) are steroids with a key role in promoting
normal sex differentiation and development, pubertal masculinization, initiation of
spermatogenesis, and maintenance of male sexual function [1-3]. There is evidence for
androgen production in the testes, skin, and submaxillary glands, although androgen levels
are highest in the prostate gland. The role of androgens in the control of normal and
abnormal sexual development in humans and other vertebrates has been studied since
1935, when the chemical structure of testosterone was first elucidated [1]. The androgen
receptor (AR) is a ligand-activated transcription factor and a member of a large
superfamily of related receptors [4-7].

All the regulatory actions produced by androgens occur at the level of
transcription initiation, in which a hierarchy of intermolecular interactions is based on
the DNA-bound complex of the androgen receptor (AR) [4-9]. The AR gene contains an
open reading frame spread over eight exons, producing a protein of 919 amino acids in
humans with four discrete functional domains, shown in Figure 1 [2,3,10]. The
DNA-binding domain (DBD) forms a homodimer, but only on the specific DNA binding
sites of AR. A similar DNA-dependent dimerization mode is used by other steroid
receptors and many other members of the nuclear receptor superfamily, although for
non-steroid receptors the DBDs can also form heterodimers with the 9-cis retinoic acid
receptor (RXR) [6].

The core DBDs of nuclear receptors are highly conserved across the nuclear
receptor family (typically 50-60% identical in amino-acid sequence). Because this region
is so conserved, the origins of DNA-binding specificity remained somewhat elusive
until the recent structure determination of related receptor DNA-binding complexes. Our
own laboratory's crystal structures of related DBDs have shown a special role for the
hinge region, which imparts selectivity in DNA minor groove recognition and also
provides the unique DNA-dependent dimerization in some cases [11-14]. This region is
not conserved in size or sequence across the superfamily, further underscoring its unique
role in specifying the DNA-binding site for each particular receptor.


Figure 1: The domain and exon representation of the human androgen receptor. The nuclear
receptors have two independent and separable functions: the DNA-binding domain (DBD) confers
response element recognition, and the ligand-binding domain (LBD) binds the steroid hormone
(shown schematically).


The  Androgen  Response  Elements: 

All the steps in androgen signaling occur only through the DNA-bound complex of this
receptor [2,4,5]. The specific DNA-binding sequences in the genome are known as
response elements. A number of genes have been shown to contain AR DNA
response elements in upstream regulatory sites. The AR response element DNA is a
nearly palindromic sequence which, as expected, accommodates the homodimeric form
of AR DBDs, and is superficially related to the response sites of the glucocorticoid and
estrogen receptors. In these response sites, two consensus hexameric stretches of the
sequence 5'-AGAACA-3' are separated by three base pairs [2,4,5]. However, naturally
occurring androgen response sites usually contain significant deviations from the
consensus half-site sequences, and also contain unique sequences in the flanking regions
which increase the binding affinity and allow target site selectivity amongst the related
steroid receptors [15]. Such androgen response elements have been identified upstream
of the probasin gene and also the prostate specific antigen (PSA) gene; PSA is also used
as a diagnostic marker for prostate cancer [16].
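The half-site arrangement described above can be illustrated with a short sketch. It builds the idealized element from the AGAACA half-site and a three-base spacer, and counts how far each half of a given site deviates from the consensus. The function names are illustrative only, and the spacer is written as NNN because the text does not specify it.

```python
# Watson-Crick complements; N (any base) pairs with N.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def idealized_element(half_site="AGAACA", spacer="NNN"):
    """Half-site, spacer, then the inverted copy of the half-site."""
    return half_site + spacer + reverse_complement(half_site)


def half_site_mismatches(element, half_site="AGAACA", spacer_len=3):
    """Count mismatches of the left and right half-sites against the consensus.

    The right half is read on the opposite strand, so it is
    reverse-complemented before comparison.
    """
    n = len(half_site)
    left = element[:n].upper()
    right = reverse_complement(element[n + spacer_len:2 * n + spacer_len])

    def count(observed):
        return sum(a != b for a, b in zip(observed, half_site))

    return count(left), count(right)


print(idealized_element())                      # AGAACANNNTGTTCT
print(half_site_mismatches("AGAACANNNTGTTCT"))  # (0, 0)
```

A perfectly palindromic site scores (0, 0); natural elements that deviate from the consensus score higher on one or both halves.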

AR  DNA-BINDING  DOMAINS  +/-  HINGE 


Figure 2. The AR core DBD containing a C-terminal extension into the receptor's hinge
region. Residues numbered from 1 to 66 represent the core DBD, and residues beyond 66
are from the hinge region of the receptor (shown schematically in Figure 1).
Goals and Key Research Accomplishments over the previous Year


One of our two primary aims has been to work toward crystallization and X-ray crystal
structure determination of the minimal androgen receptor (AR) DNA-binding domain
(DBD) in complex with a 19 base pair idealized DNA target. It is important to note that
the DNA target used here is quite similar to the binding site generally attributed to
other steroid receptors such as the glucocorticoid and mineralocorticoid receptors, and thus
is unlikely to provide all the information about the target selectivity of AR. For this
reason, we have a second goal (described below) that will address this specific question
more thoroughly. Nevertheless, we have been able to grow these crystals (Figure 3) and
improve their size and quality. We have tested several cryoprotectants
for use in synchrotron data collection at cryogenic temperatures (where
we are likely to achieve the best diffraction), and have also scheduled a synchrotron trip
for later this year to collect a complete diffraction dataset. To assist our ability to
solve the structures, we have also devised a number of search models for use in
molecular replacement. These search models consist of common sequences shared by
other DBDs whose structures we have previously solved in the laboratory. Other,
AR-specific amino acids are trimmed to alanines to assist in the molecular replacement
search. We have tested this strategy on other DBD/DNA complexes involving nuclear
receptors and believe that it is likely to prove successful for solving the structure of our
minimal AR DBD/DNA complex.
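The alanine-trimming idea can be sketched at the sequence level as follows. This is a minimal illustration and not the procedure used in the laboratory: real search models are prepared from atomic coordinate files, and the two short sequences in the usage lines are invented placeholders, not AR or template sequences.

```python
def alanine_trimmed_model(template, target):
    """Return the residue sequence of a trimmed search model.

    template: sequence of a previously solved DBD (pre-aligned).
    target:   sequence of the protein being solved (pre-aligned).
    Positions where the two agree keep their residue; positions that
    differ are reduced to alanine ("A").
    """
    if len(template) != len(target):
        raise ValueError("sequences must be pre-aligned to equal length")
    return "".join(t if t == a else "A" for t, a in zip(template, target))


# Hypothetical six-residue example: positions 2 and 3 differ.
print(alanine_trimmed_model("CKVCGD", "CLICGD"))  # CAACGD
```

Keeping only the shared residues avoids biasing the search with side chains that are not present in the target.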


Figure 3: Crystals of our AR DBD complex with a 19-mer DNA duplex. The DNA is an
idealized steroid response element composed of two AGAACA sequences arranged
symmetrically about a three base-pair spacer. The DBD is a minimal sequence composed
mainly of the AR core sequence.


The second aim of the proposal is to pursue crystals of other DNA binding complexes of
AR which are more informative in terms of the binding selectivity of this receptor with
respect to glucocorticoid, mineralocorticoid, and other steroid receptors. We have taken a
two-tiered approach in our initial biochemical studies towards identifying the most useful
candidates for structural examination. First, we have made a series of AR DNA binding
regions in which the core DBD is extended at its C-terminal end with hinge region
segments of various sizes, in order to map out more precisely the boundaries of the protein
to be used in crystallization. Based on previous work we and others have carried out with
nuclear receptor DBDs, we strongly believe that additional sequences beyond the core
AR DBD are likely to have major consequences in terms of target DNA selectivity and
cooperative homodimerization. Figure 2 shows the extension of the core DBD in the
case of the androgen receptor (residues beyond 66), and Figure 4a shows the specific
constructs that we made for over-expression in E. coli. The five constructs that we now
have all share a common N-terminal (His)6 tag for purification and the 67 residue core DBD
region, plus 0, 10, 20, 30, or 41 hinge region residues at their C-terminus,
respectively. The constructs were generated by PCR, cloned into the pET16b vector, and
expressed in the BL21-DE3 strain. A Ni-NTA column and an S column are used in
succession to purify the proteins (all of which proved to be in the soluble fraction of
E. coli). Figures 4b-c show an example of how each of these constructs can be purified to
homogeneity. All of this work represents new achievements in the past twelve months,
and will significantly guide us in generating the most important and useful co-crystal
structures in the upcoming two years for meeting our overall goals.
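As a rough bookkeeping aid, the expected lengths of the five constructs can be tallied from the numbers given above. The sketch assumes that the tag is exactly the MGHHHHHH sequence named in the Figure 4a caption and that the vector contributes no additional linker residues; both are assumptions, and the true lengths should be read from the construct sequences themselves.

```python
HIS_TAG = "MGHHHHHH"               # tag sequence as named in the Figure 4a caption
CORE_DBD_LENGTH = 67               # core DBD length stated in the text
HINGE_LENGTHS = [0, 10, 20, 30, 41]  # hinge residues for constructs 1-5


def construct_lengths(tag=HIS_TAG, core=CORE_DBD_LENGTH, hinges=HINGE_LENGTHS):
    """Map each construct name to its total residue count (tag + core + hinge)."""
    return {
        f"Construct {i}": len(tag) + core + hinge
        for i, hinge in enumerate(hinges, start=1)
    }


for name, length in construct_lengths().items():
    print(name, length)
```

Under these assumptions the constructs span 75 to 116 residues.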


[Figure 4a image: amino-acid sequences of Constructs 1 through 5, each beginning with the MGHHHHHH tag]


Figure 4a: Five AR DNA binding constructs that were made with varying lengths of hinge region.
We are testing each of these using electrophoretic mobility shift assays to characterize their
binding to various AR response elements. The MGHHHHHH sequence at the beginning of each
construct represents a "his-tag" sequence that is added for ease of protein purification. The
methionine residue at the end of construct 1 represents residue 66 in Figure 2, and constructs 2-5
contain an additional 10-40 residues from the AR hinge region.


Figure 4b: Construct 5 eluted from the Ni-NTA column. Lanes 1 and 2: fractions eluted from the
Ni-NTA column; lane 3: molecular weight marker.

Figure 4c: Construct 5 eluted from the S column. Lanes 1, 2, and 3: fractions eluted from the
S column; lane 4: molecular weight marker.
The second approach we have taken to find the best candidates for structural studies is
to examine androgen-responsive elements which represent highly selected and natural
targets for AR. Unlike the DNA contained in the crystals described in Aim 1, the naturally
occurring androgen response elements in this aim contain a) significant differences from
the consensus steroid element, b) additional flanking sequences outside of the core
symmetric sequence, and c) less extensive two-fold symmetry. Therefore, a structure of
such an additional complex, in comparison with the structure we are pursuing in Aim 1,
will likely teach us considerable new lessons about AR DNA binding specificity and
selectivity.

A recent report (by Dr. Colleen Nelson and colleagues) characterized several different
types of androgen response elements (AREs) that occur in the promoter of an important
androgen-responsive gene, the probasin gene (see Figure 5). These response elements, together
with those identified earlier in the PSA promoter, can all be categorized into
two major classes. Class I sequences are more typical of conventional steroid response
elements, with the sequence RGAACA-NGN-TGTNCT. Class II AREs were only
discovered by methylation protection assay in the presence of androgen receptor and do
not share all the hallmarks of Class I. The class II consensus sequence is
RGGACA-NNA-AGCCAA. It has been suggested that an appropriate combination of class I and class
II AREs, as occurs in vivo, can lead to allosteric or perhaps differential binding.
We therefore expect to expand our studies by using the five protein constructs in Figure 4a and
the two major classes of naturally occurring AREs in Figure 6 to identify the most
functionally revealing constructs for our upcoming co-crystallization trials.
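To make the two consensus sequences concrete, a candidate site can be scored against each by counting mismatches, treating R (A or G) and N (any base) as degenerate IUPAC codes. The sketch below is illustrative only: a mismatch count is a crude proxy and says nothing about measured binding affinity, and the function names are not from any published tool.

```python
# IUPAC nucleotide codes used in the two consensus sequences.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}

CLASS_I = "RGAACA-NGN-TGTNCT"
CLASS_II = "RGGACA-NNA-AGCCAA"


def consensus_mismatches(site, consensus):
    """Count positions where the site is not allowed by the consensus."""
    site = site.replace("-", "").upper()
    consensus = consensus.replace("-", "").upper()
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base not in IUPAC[code] for base, code in zip(site, consensus))


def closest_class(site):
    """Return the better-matching class and both mismatch counts."""
    scores = {
        "class I": consensus_mismatches(site, CLASS_I),
        "class II": consensus_mismatches(site, CLASS_II),
    }
    return min(scores, key=scores.get), scores


# Probasin G-1 element as listed in Figure 6.
print(closest_class("GGGACA-TAA-AGCCCA"))
```

For the probasin G-1 element this gives one mismatch against the class II consensus and five against class I, consistent with its assignment to class II.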

ARE1   G1   ARE2   G2

Fig 5. Probasin promoter structure. There are four different AREs (ARE1, G1, ARE2, and G2) within
the -268 to -76 region of the promoter.

Class II AREs
Probasin G-1          GGGACA-TAA-AGCCCA
Probasin G-2          ATGACA-CAA-TGTCAA
PSA Enhancer V        GGGACA-ACT-TGCAAA
PSA Enhancer IIIA     AGGACA-GTA-AGCAAG
PSA Enhancer IV       AGATCA-TGA-AGATAA
Slp 2                 AGAACT-GGC-[illegible]
CONSENSUS CLASS II    RGGACA-NNA-AGCCAA

Class I AREs
Probasin ARE1         ATAGCA-TCT-TGTTCT
Probasin ARE2         AGTACT-CCA-AGAACC
PSA Enhancer          GGAACA-TAT-TGTATT
PSA ARE2              GGATCA-[illegible]
PSA ARE               AGAACA-GCA-AGTGCT
Slp 3                 AGAACA-GGC-TGTTTC
CONSENSUS CLASS I     RGAACA-NGN-TGTNCT

Position markers shown on the consensus: -7, -2, 0, +2, +7.

Fig  6.  Class  I  and  Class  II  AREs  that  we  are  pursuing  in  our  studies. 

To assay protein-DNA binding, we have been relying on a sensitive electrophoretic
mobility shift assay (EMSA). So far, four different simple AREs from the probasin promoter,
including two class I (ARE1, ARE2) and two class II (G-1, G-2), have been synthesized,
in each case with one strand fluorescein end-labeled. For comparison, the highest
affinity but unnatural simple ARE, GGTACAnnnTGTTCT, has also been synthesized and
end-labeled. We have carried out several EMSAs and are working to complete our
studies. In each case, to determine relative binding affinity, a constant amount of
DNA is combined with increasing amounts of protein (see Figure 7 for an example).


Figure 7. EMSA experiment characterizing the efficiency of AR construct 5 (Figure 4a)
binding to a type II ARE. Lane 1: free DNA (ARE2; 40 nM); lane 2: DNA plus 1:15 (mol
ratio) protein; lane 3: DNA plus 1:15 protein plus 2 µM poly dI-dC; lane 4: DNA plus 1:30
protein. Note that the binding affinity is very low here; the binding conditions may need to be optimized.
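The titration conditions quoted for Figure 7 can be converted into protein concentrations and related to an apparent dissociation constant with a short sketch. It assumes that "1:15" means 15 mol of protein per mol of DNA, and it uses a simple single-site binding model with protein in large excess over DNA. That model ignores the cooperative dimer binding expected for AR, so any value it yields is only a rough apparent Kd. The function names are illustrative.

```python
def protein_concentration_nM(dna_nM, protein_per_dna):
    """Total protein concentration for a given DNA concentration and molar ratio."""
    return dna_nM * protein_per_dna


def fraction_bound(protein_nM, kd_nM):
    """Fraction of DNA bound for a single-site model with excess protein."""
    return protein_nM / (kd_nM + protein_nM)


def apparent_kd_nM(protein_nM, bound_fraction):
    """Invert the single-site model to estimate an apparent Kd from one lane."""
    if not 0 < bound_fraction < 1:
        raise ValueError("bound_fraction must be between 0 and 1 (exclusive)")
    return protein_nM * (1 - bound_fraction) / bound_fraction


# Lanes 2 and 4 of Figure 7: 40 nM DNA with 15-fold and 30-fold protein.
for ratio in (15, 30):
    print(ratio, protein_concentration_nM(40, ratio), "nM protein")
```

With 40 nM DNA, the 1:15 and 1:30 lanes correspond to 600 nM and 1200 nM protein under this reading of the ratio; the bound fraction itself would have to be measured from the gel.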

Reportable  Outcomes 

Five  new  over-expression  constructs  for  producing  the  AR  DNA  binding  region. 


Conclusions 

We are in a position to determine important co-crystal structures of the androgen
receptor DNA binding complex with one or more response elements. Once we identify
the most important crystallization targets and determine their structures, we will be well
positioned to understand a) structure/function relationships in terms of the AR DNA
binding region and androgen response elements, b) the basis for certain androgen
insensitivity syndrome mutations that fall in the DNA-binding domain, and c)
whether the protein-DNA complex can be viewed as a useful drug target.

Our goal for the next twelve months is to complete our biochemical studies on protein
and DNA constructs, identify the high affinity complexes that are likely to yield
additional useful crystals, and work toward the structure determination. If our
EMSAs indicate that several different response elements show high affinity binding to AR
DNA binding regions, we are likely to pursue all of these for structure determination to
understand how class I, class II, and the consensus ARE differ in their protein-DNA
contacts.